-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor audio handling #277
base: master
Are you sure you want to change the base?
Conversation
The Windows build seems to be failing because of the latest commit involving macOS in master |
i like. build error will go away if you merge master back in. for notification sounds: you might need to fiddle with cmake for this (in case notification sounds are enabled and voice is disabled). although maybe u can just change the include guard from WITH_VOICE to WITH_MINIAUDIO. other issue would be how to actually play it considering they use the high-level engine api from miniaudio while voice uses low-level device api. we can just shove both into the AudioManager but if we want to avoid that duplication, we can either use the high-level api for everything incl voice (the engine config can take context and playback device and per-user devices seems niche but im ok with doing that if its not too much of a mess to work with |
I don't think that's possible, looks like engine API is meant for only for playback. For captures we'll still have to resort to low-level API, especially if we want to support split voice audio, since the audio devices will need to be started/stopped dynamically.
My main use case for this would be for recording voice calls into separate audio channels in OBS, so that I can manipulate each user's audio independently. |
true. for playback then it might be better to just use engine api and mix device init/callback stuff like i said. capture seems like it would remain the same
ok sure im cool with that. as long as it doesnt clobber up the ui for most people who wont need that |
AudioManager: * Audio devices get started/stopped on voice connect/disconnect * Reduce code duplication with Open/TryOpen functions * Don't store device configs and ids AudioDevices: * Rename [Get/Set]ActivePlaybackDevice to [Get/Set]ActivePlaybackDeviceIter * Declare getters as const and [[nodiscard]] * Reduce code duplication by using Get[Playback/Capture]DeviceIDFromModel * Add GetActive[Playback/Capture]
- Replace std::move with reference - Log warning instead of assert on opening/closing devices - Remove branching in logging + extract into a function
Currently rewriting and splitting |
uhh i probably figured it was simpler since im not doing any fancy processing and opus's float function just internally converts to s16. rnnoise doesnt actually use f32 it uses s16 also. if you think you get something out of switching over to f32 then go for it |
Lines 571 to 575 in b3a8356
|
That's only the case if it was compiled with #ifdef FIXED_POINT
// snip
#else
opus_int32 opus_encode_float(OpusEncoder *st, const float *pcm, int analysis_frame_size,
unsigned char *data, opus_int32 out_data_bytes)
{
int frame_size;
frame_size = frame_size_select(analysis_frame_size, st->variable_duration, st->Fs);
return opus_encode_native(st, pcm, frame_size, data, out_data_bytes, 24,
pcm, analysis_frame_size, 0, -2, st->channels, downmix_float, 1);
} |
Using
So we might just use
|
ok sure lets do f32 then. bonus points for it also being easier to reason with since its easy -1 to 1 |
forgot to mention but i rewrote the buffer stuff just a bit to add a jitter buffer since some ppl were complaining about sound so lemme know if u have questions on how that works. admittedly its not the best implementation |
just kidding im gonna revert that tomorrow cuz it looks like its causing more problems than it solves |
Oh okay, I'll see if I can implement it anyway. It looks like the original code uses |
you do get underruns but it just results in a little crackle. i encounter it sometimes but its not too bad but i guess its worse if you have bad internet. i think something like libspeexdsp's jitter buffer would be ideal but i dont want to bring it in as a dependency tbh. every jitter buffer i found seems to operate on RTP payloads too so i guess we should also (which my attempt didnt do lol) |
I meant that Deque is also a dynamically sized meaning it can cause a reallocation, which is something is generally discouraged in real-time audio processing. I think we should implement a proper ring buffer with fixed size, so that Opus decoder stops pushing samples if the audio thread can't keep up. |
yeah true. i thought about that but chose to worry about it later 😼. miniaudio has a ring buffer for audio too https://miniaud.io/docs/manual/index.html#RingBuffers |
Haven't seen that, should be easy to integrate
I don't think this would happen with ring buffer, since the writer (decoder) is going to discard new frames instead of the old ones in case the reader (audio callback) cannot keep up. The reader can then detect that the buffer is full and and apply something like interpolation to catch up with the writer. This might distort the audio (higher pitch), but it won't end up choppy. |
maybe. might be worth looking at how something like mumble does it. i know they use libspeexdsp for jitter so we can see how that interacts |
I think this should be all |
still need to fix the build when compiling without audio support so a bunch of stuff probably needs to be wrapped in |
I think the right approach would be not including the sources in CMake at all, not only does it increase the compilation time, but also having to include the guard in each file seems tedious. Just did that, seems to compile and work fine. |
Uh nevermind |
The build seems to be failing now because of LTO. Might be this issue? |
maybe this? diasurgical/devilutionX@8693a0f |
Sure, I'll see if disabling decloning for those specific files works, if not, we can try disabling it for the entire executable. |
8c9aa44
to
e323410
Compare
I have no idea what's breaking it now, I'll just disable LTO for Windows |
theres a good bit of missing headers. it builds without them thanks to precompiled headers meaning everything ends up transiently included but they should be added anyways (it breaks my intellisense too). you can disable precompiled headers in cmake and it should give you errors where stuff is missing. adding headers wont hurt compile time (at least in my testing) if they are included in the precompiled headers |
ok giving it some more testing
|
e6191d9 should address point 2. shouldnt be too hard to port over here |
Seems to happen if you use RNNoise as VAD and then switch it to gate. |
Not sure what could be wrong here, |
Hey, here is another PR from me!
Made a few changes to audio handling and cleaned up some of the code. In particular audio devices get started/stopped on demand upon entering voice call. This allows audio backends like PipeWire to suspend sinks when they there isn't anything active connected to them.
AudioManager
:Open/TryOpen
functionsAudioDevices
:[Get/Set]ActivePlaybackDevice
to[Get/Set]ActivePlaybackDeviceIter
const
and[[nodiscard]]
Get[Playback/Capture]DeviceIDFromModel
GetActive[Playback/Capture]
TODO:
SystemAudio